fix(robot-server): fail proto load gracefully #18259
Merged
Conversation
SyntaxColoring approved these changes on May 5, 2025
SyntaxColoring (Contributor) left a comment:

Affirming my former (accidental) approval with this (intentional) approval. Thank you!
"Loading" a protocol — our generic term for everything required to get a protocol ready to run — can now fail. We hadn't considered this failure mode. When it happened, we would have created some but not all of our run-tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store saved the orchestrator object in a member as soon as it was created, so the store reported a "current run"; but because we didn't link resource allocation (or at least binding) to initialization, an exception raised after that point meant the run never went into the database.

The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after all of the possibly-failing orchestration is done.

The way to make this fail is a Python protocol that is syntactically correct (it passes ast.parse()) but does something at file scope that raises an error — an import failure, say; a division by zero would work too. This fails when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug.

Closes EXEC-1450
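The deferred-binding fix described above can be sketched in isolation. Note this is an illustrative model, not the actual robot-server code: the names `OrchestratorStore`, `LoadError`, `_load_protocol`, and `current_run_id` are hypothetical stand-ins for the real classes and members.

```python
class LoadError(Exception):
    """Stand-in for any exception raised while loading a protocol."""


class OrchestratorStore:
    """Sketch of the store: the member is bound only after a fully successful load."""

    def __init__(self):
        self._orchestrator = None

    @property
    def current_run_id(self):
        """A "current run" exists only if an orchestrator is bound."""
        return self._orchestrator["run_id"] if self._orchestrator else None

    def create(self, run_id, protocol_source):
        # Build everything in a local first; any step here may raise.
        orchestrator = {"run_id": run_id}
        self._load_protocol(orchestrator, protocol_source)  # may raise LoadError
        # Bind to the data member only after all possibly-failing work is
        # done, so a failed load never leaves a phantom "current run".
        self._orchestrator = orchestrator
        return orchestrator

    def _load_protocol(self, orchestrator, protocol_source):
        # Placeholder for the real (possibly-failing) protocol load.
        if protocol_source == "bad":
            raise LoadError("protocol failed at file scope")
```

With binding deferred, a failed `create` leaves the store with no current run, so the API stays consistent with the (absent) database row.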
Force-pushed from 7be77cb to 1bd8021
Codecov Report: All modified and coverable lines are covered by tests ✅

@@           Coverage Diff            @@
##             edge  #18259   +/-   ##
=======================================
  Coverage   23.62%  23.62%
=======================================
  Files        3055    3055
  Lines      255734  255734
  Branches    30114   30114
=======================================
  Hits        60419   60419
  Misses     195301  195301
  Partials       14      14
ddcc4 pushed a commit that referenced this pull request on May 16, 2025:
"Loading" a protocol — our generic term for everything required to get a protocol ready to run — can now fail. We hadn't considered this failure mode. When it happened, we would have created some but not all of our run-tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store saved the orchestrator object in a member as soon as it was created, so the store reported a "current run"; but because we didn't link resource allocation (or at least binding) to initialization, an exception raised after that point meant the run never went into the database.

The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after all of the possibly-failing orchestration is done.

The way to make this fail is a Python protocol that is syntactically correct (it passes ast.parse()) but does something at file scope that raises an error — an import failure, say; a division by zero would work too. This fails when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug. Luckily this is one of those "the error is presented badly" type of problems, so it doesn't have to be prioritized.

Closes EXEC-1450
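The triggering condition — a file that passes ast.parse() but raises when exec'd — can be demonstrated on its own. The protocol source below is a made-up example, not one of the actual repro files:

```python
import ast

# A "protocol" that is syntactically valid but raises at file scope.
BAD_PROTOCOL = "x = 1 / 0  # valid syntax; executing it raises ZeroDivisionError\n"


def parses_ok(source: str) -> bool:
    """ast.parse only checks syntax; it never runs the code."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def execs_ok(source: str) -> bool:
    """exec actually runs the file scope, so runtime errors only surface here."""
    try:
        exec(compile(source, "<protocol>", "exec"), {})
        return True
    except Exception:
        return False
```

For this source, `parses_ok` returns True while `execs_ok` returns False, which is exactly the gap between the syntax check and the runtime-parameter exec that exposed the bug.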
ddcc4
pushed a commit
that referenced
this pull request
May 29, 2025
"Loading" a protocol, which is what we generically call the process to do everything required to get it to run, can now fail. We didn't think about this failure mode, and if it happened then we would have created some but not all of our run tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store would save the orchestrator object in a member, which then means that it would have a "current run", but since we didn't link resource allocation (or at least binding) and initialization, the exception raised after that happens would mean it never went into the database. The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after we do all the possibly-failing orchestration. The thing that would make this fail is to have a python protocol that was syntactically correct (passes ast.parse()) but did something at the file scope that would cause an error (like an import failure - a division by 0 would work too). This will fail when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug. Luckily this is one of those "the error is presented badly" type of problems so it doesn't have to be prioritized. Closes EXEC-1450 (cherry picked from commit fe8d402)
ddcc4
pushed a commit
that referenced
this pull request
May 29, 2025
"Loading" a protocol, which is what we generically call the process to do everything required to get it to run, can now fail. We didn't think about this failure mode, and if it happened then we would have created some but not all of our run tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store would save the orchestrator object in a member, which then means that it would have a "current run", but since we didn't link resource allocation (or at least binding) and initialization, the exception raised after that happens would mean it never went into the database. The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after we do all the possibly-failing orchestration. The thing that would make this fail is to have a python protocol that was syntactically correct (passes ast.parse()) but did something at the file scope that would cause an error (like an import failure - a division by 0 would work too). This will fail when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug. Luckily this is one of those "the error is presented badly" type of problems so it doesn't have to be prioritized. Closes EXEC-1450 (cherry picked from commit fe8d402)
ddcc4
pushed a commit
that referenced
this pull request
May 29, 2025
"Loading" a protocol, which is what we generically call the process to do everything required to get it to run, can now fail. We didn't think about this failure mode, and if it happened then we would have created some but not all of our run tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store would save the orchestrator object in a member, which then means that it would have a "current run", but since we didn't link resource allocation (or at least binding) and initialization, the exception raised after that happens would mean it never went into the database. The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after we do all the possibly-failing orchestration. The thing that would make this fail is to have a python protocol that was syntactically correct (passes ast.parse()) but did something at the file scope that would cause an error (like an import failure - a division by 0 would work too). This will fail when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug. Luckily this is one of those "the error is presented badly" type of problems so it doesn't have to be prioritized. Closes EXEC-1450 (cherry picked from commit fe8d402)
"Loading" a protocol, which is our generic term for everything required to get it ready to run, can fail. We hadn't accounted for this failure mode: if it happened, we would have created some but not all of our run tracking resources, and our API would return inconsistent results (which is the bug). The orchestrator store saved the orchestrator object in a member, which means it would report a "current run"; but since we didn't link resource allocation (or at least binding) to initialization, an exception raised after that point meant the run never went into the database.
The immediate fix is to not bind the orchestrator object to the _orchestrator data member until right before we return, after all of the possibly-failing orchestration is done.
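The bind-late pattern described above can be sketched as follows. `Orchestrator`, `OrchestratorStore`, and their methods here are illustrative stand-ins, not the actual robot-server API:

```python
class Orchestrator:
    """Illustrative stand-in for the run orchestrator."""

    def __init__(self, protocol_source: str) -> None:
        self.protocol_source = protocol_source

    def load(self) -> None:
        # Simulate the fallible "load" step: executing the protocol
        # can raise even if the source parsed cleanly.
        exec(compile(self.protocol_source, "<protocol>", "exec"))


class OrchestratorStore:
    """Illustrative stand-in for the store that tracks the 'current run'."""

    def __init__(self) -> None:
        self._orchestrator = None

    @property
    def current_orchestrator(self):
        return self._orchestrator

    def add(self, protocol_source: str) -> Orchestrator:
        # Do every possibly-failing step on a local variable first...
        orchestrator = Orchestrator(protocol_source)
        orchestrator.load()  # may raise
        # ...and bind to the member only once nothing else can fail,
        # so a failed load never leaves a phantom "current run".
        self._orchestrator = orchestrator
        return orchestrator
```

If `load()` raises, the store's member is never assigned, so the store stays consistent with the database: no half-created run.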
The way to make this fail is a Python protocol that is syntactically correct (it passes ast.parse()) but does something at file scope that raises an error (an import failure; a division by zero would work too). That raises when we exec the protocol to check for the presence of runtime parameters, which happens during the execution of this function, triggering the bug.
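A minimal illustration of why such a protocol slips past parsing but fails at exec time (the module name is made up for the example):

```python
import ast

# Syntactically valid source that fails at file scope when executed.
source = "import does_not_exist_module\n"

ast.parse(source)  # succeeds: the syntax is fine

try:
    exec(compile(source, "<protocol>", "exec"))
except ModuleNotFoundError:
    # This is the point where "loading" the protocol blows up,
    # after parsing has already succeeded.
    pass
```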
Luckily this is one of those "the error is presented badly" problems, so it doesn't have to be prioritized.
Closes EXEC-1450